1 Introduction

Sentiment indicators are often considered to be among the most important leading indicators
of the real economy (Dreger and Kholodilin, 2013) and are therefore closely followed by
business cycle analysts, central banks and business owners (Vuchelen, 2004, Claveria et al.,
2007, Martinsen et al., 2014). However, studies on the predictive power of sentiment
indicators find mixed results. While many studies find that sentiment indicators have
predictive power for future economic developments (Kumar et al., 1995, Hansson et al., 2005,
Lemmens et al., 2005, Abberger, 2007, Klein and Oezmucur, 2010, Christiansen et al., 2014),
others conclude that sentiment indicators provide only limited information for predicting
economic variables (Cotsomitis and Kwan, 2006, Claveria et al., 2007, Dreger and Kholodilin,
2013 and Bruno, 2014).


An important commonality between these studies is the use of aggregate sentiment
indicators. This paper, instead, examines the predictive power of disaggregate sentiment
indicators. Especially in the context of business sentiment, which is the topic of this
paper, some segments have more predictive power than others. Here, we segment firms according
to their industry. Our methodology takes into account that the different industry segments
might contain predictive power for different macro-economic indicators.

To study the predictive power, we use a Granger Causality approach. A (set of) time
series is said to Granger Cause another time series if the former has incremental predictive
power for predicting the latter. Granger Causality tests in low-dimensional time series
settings have a long history. They are used, among others, in macro-economics to study the
predictive power of monetary aggregates for output and price variables (Sahoo and Acharya,
2010), in operational research to study the predictive power of academic literature for
practitioner literature (Ghosh et al., 2010), or in finance to study the predictive power of
volume for stock prices (Blasco et al., 2005). Because predictive analysis based on
disaggregate sentiment indicators requires handling a large number of such indicators, we
introduce a Granger Causality testing procedure applicable to high-dimensional time series.

Recently, a small but growing literature on inference in penalized regression models for
cross-sectional data has arisen, such as Wasserman and Roeder (2009), Meinshausen et al.
(2009) and Chatterjee and Lahiri (2011). We extend the residual bootstrap procedure of
Chatterjee and Lahiri (2011) to high-dimensional time series data. The bootstrap test
statistic, based on the Adaptive Lasso (Zou, 2006), identifies those industry segments whose
predictive power is statistically significant. Our simulation study shows that this test
statistic is more powerful than the standard Wald test statistic in a high-dimensional
setting. Furthermore, important gains in forecast accuracy are obtained by not using all
industry segments but by first selecting the most predictive ones using the bootstrap test
statistic.

We use a unique data set that not only measures the sentiment of firms towards their own
situation ("business sentiment"), as is classical for sentiment indicators, but also measures
the sentiment of firms towards the banking industry ("bank sentiment"). For the economy to
be able to grow, it is essential that firms have access to credit, typically provided by
banks. Especially in the aftermath of the recent economic downturn and banking crises,
distressed banks can constrain the economy (Kroszner et al., 2007, Dell'Ariccia et al., 2008,
Fernandez et al., 2013). To the best of our knowledge, we are the first to study the
importance of sentiment towards the banking industry.

The remainder of this article is structured as follows. Section 2 describes the data on the
business and bank sentiment, as well as the macro-economic indicators. Section 3 introduces
Granger Causality testing in high-dimensional time series models. In Section 4, a simulation
study shows the good performance of our methodology in terms of size and power of the test
statistic and forecast accuracy. In Section 5, we apply the proposed methodology to identify
the most predictive industry segments for several future macro-economic indicators. In the
section that follows, we show that forecast accuracy can be improved by using only the most
predictive industry segments instead of all industry segments. The final section concludes.
Table 1: Industry Segments. Businesses are divided into 10 industry segments.

Industry | Description | Sector
Industry 1 | Agriculture, forestry, fishing, mining and quarrying and other industry | Primary
Industry 2 | Manufacturing | Secondary
Industry 3 | Construction | Secondary
Industry 4 | Wholesale and retail trade, transportation and storage, accommodation and food service activities | Tertiary
Industry 5 | Information and communication | Quaternary
Industry 6 | Financial and insurance activities | Quaternary
Industry 7 | Real estate activities | Quaternary
Industry 8 | Professional, scientific, technical, administration and support service activities | Quaternary
Industry 9 | Public administration, defence, education | Quaternary
Industry 10 | Other services | Quaternary

2 Data 

We use a unique data set provided to us by EUWIFO, the European Economic Research
Institute. EUWIFO is an owner-managed business that conducts business climate interviews.
By conducting interviews with firms spread over Germany, EUWIFO gathers information on
the confidence these firms have in their own economic situation and in the banking sector.
Firms are divided into segments according to the industry in which they are active based on
their NACE code. These 10 industry segments are listed in Table 1.

The interviews consist of two parts. In the first part, the Business Survey, firms are asked
to assess their own situation. In the second part, the Bank Survey, firms are asked to assess
the German bank sector.


Business Survey. Each firm receives 9 questions to assess its own economic situation.
Firms are asked to assess changes (this year compared to last year) in (1) turnover, (2)
earnings, (3) number of employees, (4) investments, (5) incoming domestic orders, (6) incoming
foreign orders, (7) utility and maintenance costs, (8) tax burden, and (9) cost through
government red tape. For each question, answers are favorable, neutral or unfavorable. For all
the firms within an industry segment, a balance of opinion indicator is calculated for each
question, being the percentage of favorable answers minus the percentage of unfavorable
answers. As we construct 9 sentiment indicators for each of the 10 industries, this amounts
to 90 business sentiment indicators.

Table 2: Macro-economic indicators. All time series are seasonally adjusted (Eurostat).

Indicator | Description
IP-A1 | Production in industry: Mining and quarrying; manufacturing; electricity, gas, steam and air conditioning supply
IP-A2 | Production in industry: Construction, mining and quarrying; manufacturing; electricity, gas, steam and air conditioning supply
IP-M | Production in industry: Manufacturing
IP-E | Production in industry: Energy
IP-CaGo | Production in industry: Capital goods
IP-CoGo | Production in industry: Consumer goods
RT | Retail Trade, except of motor vehicles and motorcycles
WS | Wholesale Trade, except of motor vehicles and motorcycles
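
The balance of opinion indicator described for the Business Survey can be illustrated with a minimal Python sketch. The function name and the ten example answers are hypothetical and are not taken from the EUWIFO data; the sketch only mirrors the definition "percentage of favorable answers minus percentage of unfavorable answers".

```python
from collections import Counter


def balance_of_opinion(answers):
    """Percentage of favorable answers minus percentage of unfavorable ones."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    return 100.0 * (counts["favorable"] - counts["unfavorable"]) / len(answers)


# Hypothetical answers of ten firms in one industry segment to one question.
answers = ["favorable"] * 5 + ["neutral"] * 3 + ["unfavorable"] * 2
indicator = balance_of_opinion(answers)  # 50% favorable - 20% unfavorable
```

Neutral answers enter only through the denominator, so the indicator ranges from -100 (all unfavorable) to +100 (all favorable).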

Bank Survey. Each firm is asked to assess the German bank sector. In total, 243 German
banks are included in the Bank Survey. Each firm first has to indicate which of these 243
German banks it knows. For the banks it knows, the firm is asked to assess its consideration
towards that specific bank and the reputation of that specific bank. Answers are either
favorable or unfavorable and a balance of opinion indicator is calculated for each question.
We include three consideration indicators per industry segment: the average consideration
indicator, averaged over all German banks, the consideration indicator towards the
Sparkassen, and the consideration indicator towards the Volksbanken. The latter two are the
best-known banks in Germany. We also construct three reputation indicators per industry
segment following an analogous approach. As we construct three bank consideration and three
bank reputation indicators for each of the 10 industries, this amounts to 60 bank sentiment
indicators.

Joining the 90 business sentiment indicators and the 60 bank sentiment indicators results
in a total of 150 time series. We combine all 150 sentiment indicators in one high-dimensional
data set. All time series are observed over T = 40 months (January 2012 to April 2015).
We study the predictive power of these sentiment indicators for 8 German macro-economic
indicators (Table 2).

The 150 time series are grouped into blocks by industry segment (cf. Table 1). For each
industry segment, we have one block of 9 indicators from the Business Survey and one block
of 6 indicators from the Bank Survey. Our methodology is such that we select either all 9
business sentiment indicators for an industry, or none. Similarly, we will select either all 6
bank sentiment indicators for an industry or none. This way, we can investigate the difference
in predictive power between the business and bank sentiment indicators for the 10 industries.
To identify the most predictive blocks, we perform joint hypothesis tests. We test if the set
of indicators in a particular block Granger Causes a particular macro-economic indicator.
This predictive analysis involves a large number of disaggregate sentiment indicators. In the
next section, we introduce a Granger Causality testing procedure that can handle such a
high-dimensional situation.

3 High-dimensional Granger Causality Testing 

Performing Granger Causality tests on a data set with many time series relative to the
length of the series is challenging. In these high-dimensional settings, estimation by standard
procedures becomes inaccurate. In our sentiment application, the number of time series (i.e.
k = 150) even exceeds the length of the time series (i.e. T = 40), making it impossible to use
standard estimation procedures. Penalized estimation offers a solution.

3.1 Penalized Maximum Likelihood estimation 

Let $y_t$ be a one-dimensional stationary time series. We assume that $y_t$ follows an ARX(p)
model, i.e. an autoregressive model of order p with k predictor time series collected in the
(k × 1) vector $x_t$:

$$y_t = b_1 y_{t-1} + b_2 y_{t-2} + \dots + b_p y_{t-p} + a_1 x_{t-1} + a_2 x_{t-2} + \dots + a_p x_{t-p} + e_t, \tag{1}$$

where $b_1$ to $b_p$ are the autoregressive parameters, the parameters $a_1$ to $a_p$ are
(1 × k) vectors and the error term $e_t$ is assumed to follow a $N(0, \sigma)$ distribution.
We assume, without loss of generality, that all time series are mean centered such that no
intercept is included.

If the number of components in $x_t$ is large, the number of unknown parameters in
equation (1) explodes. To ensure accurate estimation, we use Penalized Maximum Likelihood
estimation (e.g. Zou, 2006 in a regression context, or Gelper et al., 2015 in a time series
context). Write model (1) in matrix notation as

$$y = X\beta + e, \tag{2}$$

where $y$ is the column vector $(y_1, \dots, y_T)'$, and the matrix
$X = (Y_1, \dots, Y_p, X_1, \dots, X_p)$. Here $Y_j$ is (T × 1), containing the values of the
time series at lag j in its column; and $X_j$ is a (T × k) matrix, containing the values of
the k predictor time series at lag j in its columns, for $1 \le j \le p$. The vector $\beta$
contains the parameter values $b_1, \dots, b_p, a_1, \dots, a_p$, and has length p(1 + k). In
case p(1 + k) > T, the Maximum Likelihood estimator does not exist. The Penalized Maximum
Likelihood estimator is, however, still computable.
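
To make the matrix notation concrete, the following Python sketch stacks the lagged response and the lagged predictors into a design matrix with p(1 + k) columns. The function name `build_arx_design` is hypothetical, and dropping the first p observations (whose lags are unobserved) is an assumption of this sketch, not something stated in the text.

```python
def build_arx_design(y, x, p):
    """Return (response, X) for an ARX(p) model.

    y: list of T floats; x: list of T rows, each holding k predictor values.
    Each row of X is (y_{t-1}, ..., y_{t-p}, x_{t-1}, ..., x_{t-p}), so X has
    p * (1 + k) columns. The first p observations are dropped because their
    lags are not observed (an assumption of this sketch).
    """
    T = len(y)
    if len(x) != T or p < 1 or p >= T:
        raise ValueError("need len(x) == len(y) and 1 <= p < T")
    response, X = [], []
    for t in range(p, T):
        row = [y[t - j] for j in range(1, p + 1)]  # autoregressive lags
        for j in range(1, p + 1):                  # predictor lags, lag by lag
            row.extend(x[t - j])
        response.append(y[t])
        X.append(row)
    return response, X


# Toy data with T = 5 observations and k = 2 predictors.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [[10.0, 100.0], [20.0, 200.0], [30.0, 300.0], [40.0, 400.0], [50.0, 500.0]]
response, X = build_arx_design(y, x, p=2)
```

With k = 150 predictors and T = 40, even p = 1 gives 151 columns, which is why the unpenalized estimator is not available in the sentiment application.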

The penalized estimator of the regression parameter $\beta$ is obtained by minimizing the
negative log likelihood with a penalization on the elements of $\beta$:

$$\hat\beta = \operatorname*{argmin}_{\beta} \frac{1}{T}(y - X\beta)'(y - X\beta) + \lambda \sum_{i=1}^{p(1+k)} w_i |\beta_i|, \tag{3}$$

where $w_i$ are weights and $\lambda > 0$ is a sparsity parameter. This estimator is the
Adaptive Lasso (Zou, 2006). It generalizes the popular Lasso (e.g. Hastie et al., 2009,
Chapter 3) which shows good performance in operational research (e.g. Ballings and Van den
Poel, 2015, Huang et al., 2014). The Adaptive Lasso ensures that the bootstrap (Section 3.3)
is consistent (Chatterjee and Lahiri, 2011). We take the weights of the Adaptive Lasso
$w_i = 1/|\hat\beta_i^{ridge}|$, where the Ridge estimator (Hastie et al., 2009, Chapter 3) is

$$\hat\beta^{ridge} = \operatorname*{argmin}_{\beta} \frac{1}{T}(y - X\beta)'(y - X\beta) + \lambda_{ridge} \sum_{i=1}^{p(1+k)} \beta_i^2.$$

The sparsity parameter $\lambda$ and the order of the ARX, p, are selected using the Bayesian
Information Criterion (BIC) (e.g. Abegaz and Wit, 2013 and references therein):

$$BIC_\lambda = T \cdot \log\left(\frac{1}{T}(y - X\hat\beta_\lambda)'(y - X\hat\beta_\lambda)\right) + df_\lambda \cdot \log(T),$$

where $df_\lambda$ equals the number of non-zero estimated regression coefficients. We solve
(3) over a range of values for $\lambda$ and select the one with the lowest value of the BIC.
To select the order of the ARX model, we estimate the ARX model for different values of p,
each time using the optimal value of $\lambda$ for that value of p. We then select the order p
of the ARX model again by minimizing the BIC.
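
The estimation steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the solver is a simple cyclic coordinate descent, the ridge penalty `lam_ridge = 0.1` and the grid of sparsity parameters are arbitrary choices, the loss is scaled by 1/T, and all function names are hypothetical.

```python
import math
import random


def soft_threshold(z, g):
    """Soft-thresholding operator: sign(z) * max(|z| - g, 0)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def coordinate_descent(X, y, l1, l2, sweeps=100):
    """Minimise (1/T)||y - Xb||^2 + sum_i l1[i]*|b_i| + l2*sum_i b_i^2."""
    T, m = len(X), len(X[0])
    b = [0.0] * m
    resid = list(y)  # y - Xb for the current b
    c = [sum(X[t][j] ** 2 for t in range(T)) / T for j in range(m)]
    for _ in range(sweeps):
        for j in range(m):
            if c[j] == 0.0:
                continue
            rho = sum(X[t][j] * resid[t] for t in range(T)) / T + c[j] * b[j]
            new = soft_threshold(rho, l1[j] / 2.0) / (c[j] + l2)
            if new != b[j]:
                for t in range(T):
                    resid[t] -= X[t][j] * (new - b[j])
                b[j] = new
    return b


def adaptive_lasso(X, y, lam, lam_ridge=0.1):
    """Adaptive Lasso with weights 1/|ridge estimate| (lam_ridge is assumed)."""
    m = len(X[0])
    ridge = coordinate_descent(X, y, [0.0] * m, lam_ridge)
    l1 = [lam / abs(r) if r != 0.0 else math.inf for r in ridge]
    return coordinate_descent(X, y, l1, 0.0)


def bic(X, y, b):
    """T*log(RSS/T) + df*log(T), with df the number of non-zero coefficients."""
    T = len(y)
    rss = sum((y[t] - sum(X[t][j] * b[j] for j in range(len(b)))) ** 2
              for t in range(T))
    df = sum(1 for v in b if v != 0.0)
    return T * math.log(rss / T) + df * math.log(T)


def select_by_bic(X, y, grid):
    """Fit the Adaptive Lasso on a grid and keep the fit with the lowest BIC."""
    fits = [(lam, adaptive_lasso(X, y, lam)) for lam in grid]
    return min(fits, key=lambda fit: bic(X, y, fit[1]))


# Toy data: only the first of five predictors matters (true coefficient 2).
random.seed(1)
X = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(100)]
y = [2.0 * row[0] + random.gauss(0.0, 0.1) for row in X]
grid = [0.001, 0.01, 0.1, 1.0]
best_lam, beta_hat = select_by_bic(X, y, grid)
```

Selecting the order p would repeat `select_by_bic` on the design matrix built for each candidate p and keep the order with the lowest BIC.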


3.2 Granger Causality in the ARX framework 


We partition the vector $x_t$ in different blocks, and denote the $j$th block of $x_t$ by
$x_{t,j}$, consisting of $k_j$ time series. In the ARX model (1), denote the block of
coefficients at lag i corresponding to $x_{t,j}$ by $a_{ij}$. The multivariate time series
$x_{t,j}$ is said to Granger Cause $y_t$ if the former has incremental predictive power for the
latter. We say that $x_{t,j}$ does not Granger Cause $y_t$ if the coefficients on all lags of
$x_{t,j}$ are equal to zero, i.e. $a_{1j} = \dots = a_{pj} = 0$.

The Adaptive Lasso estimator in (3) is sparse, meaning that some of its elements are
exactly zero. The larger the value of $\lambda$, the sparser the estimator. The "Granger Lasso
Selection" method (e.g. Fujita et al., 2007, Bahadori and Liu, 2013) says that a block of time
series $x_{t,j}$ Granger Causes $y_t$ if at least one of the corresponding parameters
$a_{1j}, \dots, a_{pj}$ is estimated as non-zero. Our approach is different: we infer Granger
Causality relations from a bootstrap testing procedure.


3.3 Granger Lasso test 


The null hypothesis that a block of time series $x_{t,j}$ is not Granger Causing $y_t$ can be
stated as

$$H_0 : R_j \beta = 0, \tag{4}$$

where $R_j$ is a suitable $pk_j \times p(1 + k)$ matrix. The elements of $R_j$ are either zero
or one. We assign the value one to the elements of $R_j$ corresponding to the regression
parameters $a_{1j}, \dots, a_{pj}$. The corresponding Wald test statistic is given by

$$Q = (R_j \hat\beta)'(R_j \widehat{Cov}(\hat\beta) R_j')^{-1}(R_j \hat\beta). \tag{5}$$
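
Because $R_j$ only contains zeros and ones, applying it amounts to picking entries of the coefficient vector and the matching rows and columns of the covariance matrix. The sketch below uses that shortcut. It assumes the coefficient ordering $(b_1, \dots, b_p, a_1, \dots, a_p)$ from Section 3.1, takes the covariance estimate as given (the text does not say how it is computed), and uses hypothetical function names.

```python
def block_positions(p, k, block):
    """Positions in beta of a_{1j}, ..., a_{pj} for the predictor columns
    listed in `block` (0-based indices among the k predictors)."""
    return [p + lag * k + s for lag in range(p) for s in block]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[piv][col] == 0.0:
            raise ValueError("singular matrix")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (M[i][n] - sum(M[i][c] * z[c] for c in range(i + 1, n))) / M[i][i]
    return z


def wald_statistic(beta, cov, positions):
    """Q = (R beta)' (R Cov R')^{-1} (R beta) for a 0/1 selection matrix R."""
    rb = [beta[i] for i in positions]
    rcr = [[cov[i][j] for j in positions] for i in positions]
    z = solve(rcr, rb)
    return sum(u * v for u, v in zip(rb, z))


# Toy case: p = 1 lag, k = 3 predictors, block made of predictors 1 and 2.
beta = [0.5, 9.0, 1.0, 2.0]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
positions = block_positions(p=1, k=3, block=[1, 2])
q = wald_statistic(beta, identity, positions)  # 1^2 + 2^2 with unit covariance
```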


To bootstrap this test statistic, we use the following residual bootstrap procedure (Kreiss
and Lahiri, 2012):

1. Estimate the model under the null hypothesis, i.e. model (1) with the block $x_{t,j}$
removed at the right-hand side. Compute the centered residuals $\hat e_t$, for t = 1,...,T.

2. Let B = 500 be the number of bootstraps. For b = 1,...,B:

(a) Construct the bootstrap time series $y_t^*$ from model (1) with the parameter
estimates from step 1 and with bootstrap errors $e_t^* = \hat e_{U_t}$, with $U_t$,
t = 1,...,T an i.i.d. sequence of discrete random variables uniformly distributed on
{1,...,T}. The predictor time series are kept fixed.

(b) Apply the Penalized Maximum Likelihood estimator of equation (3) to the bootstrap
sample. Denote the bootstrap estimate by $\hat\beta_b^*$.

(c) Compute the bootstrap statistic
$Q_b^* = (R_j \hat\beta_b^*)'(R_j \widehat{Cov}(\hat\beta) R_j')^{-1}(R_j \hat\beta_b^*)$.

3. Compute

$$\text{mid p-value} = \frac{1}{B} \sum_{b=1}^{B} \left( I(Q_b^* > Q) + \frac{1}{2} I(Q_b^* = Q) \right),$$

with $Q_b^*$ (for b = 1,...,B) B independent bootstrap statistics. $I(\cdot)$ is an indicator
function that takes on the value one if its argument is true and equals zero otherwise.

We use the mid p-value (Lancaster, 1949) since it may occur that the value of the test
statistic and the bootstrap test statistic are both equal to zero.
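
Two building blocks of the procedure above, the resampling of centered residuals and the mid p-value, can be sketched as follows. The function names are hypothetical and the numbers are made up. The model fitting and the bootstrap statistics themselves are omitted, so this is only a partial sketch of the procedure.

```python
import random


def centered(residuals):
    """Subtract the mean so that the residuals sum to zero."""
    mean = sum(residuals) / len(residuals)
    return [e - mean for e in residuals]


def resample_errors(residuals, rng):
    """Draw e*_t = e_{U_t}, with U_t i.i.d. uniform on the residual indices."""
    T = len(residuals)
    return [residuals[rng.randrange(T)] for _ in range(T)]


def mid_p_value(q_obs, q_boot):
    """Share of bootstrap statistics above q_obs plus half the share of ties."""
    greater = sum(1 for q in q_boot if q > q_obs)
    equal = sum(1 for q in q_boot if q == q_obs)
    return (greater + 0.5 * equal) / len(q_boot)


rng = random.Random(0)
residuals = centered([0.3, -0.1, 0.4, -0.2])
errors_star = resample_errors(residuals, rng)

# Sparse estimates can make the observed and bootstrap statistics exactly zero.
p_ties = mid_p_value(0.0, [0.0, 0.0, 1.5, 0.0])  # (1 + 0.5 * 3) / 4
```

With an ordinary p-value that counts ties in full, the example would give 1.0; counting ties at half weight gives 0.625, which is the motivation stated in the text for using the mid p-value.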


4 Simulation study 

By means of a simulation experiment, we (i) evaluate the size and power of the Granger
Lasso test and (ii) conduct a forecast exercise. We generate $y_t$ according to the following
ARX(1) model

$$y_t = 0.5 y_{t-1} + a_1 x_{t-1} + e_t, \tag{6}$$

where $e_t \sim N(0, 0.1)$. The predictors are generated as autoregressive processes
$x_t = C x_{t-1} + u_t$, with $u_t \sim N_k(0, 0.1 I)$, $C = 0.5 I$ and $I$ the k-dimensional
identity matrix. The model parameters are chosen according to the four designs detailed in
Table 3. The first three designs are the same except for the number of time series k. In
designs two and three, we add more non-informative time series to the model, i.e. time series
with a coefficient equal to zero. The standard Maximum Likelihood estimator is computable in
these three designs. The last design corresponds to the design of our sentiment application,
with k = 150 predictor time series and T = 40. Here, only the Penalized Maximum Likelihood
estimator is computable.

Table 3: Simulation designs.

Design | $a_1$ under $H_0$ | $a_1$ under $H_A$
T = 100, k = 25 | $[0.2_{1\times5}\ 0_{1\times5}\ 0_{1\times5}\ 0_{1\times(k-15)}]$ | $[0.2_{1\times5}\ 0.2_{1\times5}\ 0_{1\times5}\ 0_{1\times(k-15)}]$
T = 100, k = 50 | $[0.2_{1\times5}\ 0_{1\times5}\ 0_{1\times5}\ 0_{1\times(k-15)}]$ | $[0.2_{1\times5}\ 0.2_{1\times5}\ 0_{1\times5}\ 0_{1\times(k-15)}]$
T = 100, k = 75 | $[0.2_{1\times5}\ 0_{1\times5}\ 0_{1\times5}\ 0_{1\times(k-15)}]$ | $[0.2_{1\times5}\ 0.2_{1\times5}\ 0_{1\times5}\ 0_{1\times(k-15)}]$
T = 40, k = 150 | $[0.4_{1\times9}\ 0_{1\times9}\ \dots\ 0_{1\times9}\ 0_{1\times6}\ \dots\ 0_{1\times6}]$ | $[0.4_{1\times9}\ 0.4_{1\times9}\ 0_{1\times9}\ \dots\ 0_{1\times9}\ 0_{1\times6}\ \dots\ 0_{1\times6}]$

For each design, we consider a data generating process under the null hypothesis $H_0$ and
under the alternative hypothesis $H_A$. We divide the time series $x_t$ and the corresponding
coefficient vector $a_1$ into several blocks, as can be seen from Table 3. The first block of
time series Granger Causes the response both under $H_0$ and under $H_A$. The second block of
time series Granger Causes the response only under $H_A$. The remaining blocks of time series
never Granger Cause the response. In the first three designs, blocks one to three each contain
five time series, and the fourth block contains the remaining ones. In the last design, there
are 20 blocks, similar to our sentiment application.
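
The data generating process of the first three designs can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the text: the value 0.1 is read as a variance, and a burn-in of 50 observations is discarded so that the series start close to their stationary distribution. The function names are hypothetical.

```python
import random


def coefficients(k, under_alternative):
    """Coefficient vector a_1 for the designs with T = 100: the first block of
    five series is always informative, the second only under H_A."""
    second = 0.2 if under_alternative else 0.0
    return [0.2] * 5 + [second] * 5 + [0.0] * (k - 10)


def simulate_arx1(T, a1, rng, burn_in=50, noise_var=0.1):
    """Simulate y_t = 0.5*y_{t-1} + a1 . x_{t-1} + e_t with AR(1) predictors
    x_t = 0.5*x_{t-1} + u_t (noise_var read as a variance, burn-in assumed)."""
    k = len(a1)
    sd = noise_var ** 0.5
    x_prev, y_prev = [0.0] * k, 0.0
    ys, xs = [], []
    for t in range(T + burn_in):
        y_t = (0.5 * y_prev
               + sum(a * v for a, v in zip(a1, x_prev))
               + rng.gauss(0.0, sd))
        x_t = [0.5 * v + rng.gauss(0.0, sd) for v in x_prev]
        if t >= burn_in:
            ys.append(y_t)
            xs.append(x_t)
        y_prev, x_prev = y_t, x_t
    return ys, xs


a1_null = coefficients(25, under_alternative=False)
a1_alt = coefficients(25, under_alternative=True)
ys, xs = simulate_arx1(100, a1_null, random.Random(0))
```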

4.1 Size and power of the test statistic 

We test the null hypothesis that the second block of time series does not Granger Cause the
response. We compare the performance of the Granger Lasso test to the standard Wald test
computed from the standard Maximum Likelihood (ML) estimator.

To study the size of the test statistic, we simulate N = 1000 time series under the null
hypothesis and compute the simulated size, i.e. the proportion of simulation runs where the
null hypothesis is rejected:

$$\text{Simulated size} = \frac{1}{N} \sum_{j=1}^{N} I(p_j < \alpha), \tag{7}$$

where $p_j$ is the mid p-value obtained in simulation run j = 1,...,N and $\alpha$ is the
pre-specified significance level. We consider $\alpha = 0.01$ and $\alpha = 0.05$.

Table 4: Simulated sizes for the Wald test and Granger Lasso test.

Simulation design | Wald test, α = 0.01 | Wald test, α = 0.05 | Granger Lasso test, α = 0.01 | Granger Lasso test, α = 0.05
T = 100, k = 25 | 0.017 | 0.064 | 0.013 | 0.058
T = 100, k = 50 | 0.025 | 0.079 | 0.010 | 0.052
T = 100, k = 75 | 0.035 | 0.082 | 0.015 | 0.051
T = 40, k = 150 | NA | NA | 0.007 | 0.051

Results. Table 4 shows the simulated sizes for the standard Wald test and the Granger
Lasso test. The simulated sizes of the Granger Lasso test and the standard Wald test are
both close to the nominal size $\alpha$ in the design with T = 100, k = 25. When the number of
time series increases relative to the length of the time series (i.e. second and third design),
the Granger Lasso test remains accurately sized whereas the standard Wald test statistic
gets distorted: its simulated size deviates strongly from the nominal size. In the last design,
only the Granger Lasso test is available. For both $\alpha = 0.01$ and $\alpha = 0.05$, the
Granger Lasso test is reasonably accurately sized.


Figure 1: Size-power curve of the Granger Lasso test (solid gray line) and the standard Wald
test (dashed line), for increasing number of time series k = 25 (left), k = 50 (middle) and
k = 75 (right) with time series length T = 100. The 45° line is indicated as well.

To study the power of the test statistic, we use size-power curves (see Davidson and
MacKinnon, 1998). Size-power curves are constructed using two empirical distribution
functions. We carry out the following steps:

1. Simulate N = 1000 time series under the null hypothesis. Compute for each simulation
run j = 1,...,N the mid p-value $p_j^{H_0}$. Calculate the empirical distribution function of
the p-values:

$$\hat F^{H_0}(x_i) = \frac{1}{N} \sum_{j=1}^{N} I(p_j^{H_0} < x_i),$$

for a grid of values $x_i$, i = 1,...,m between zero and one.

2. Simulate N = 1000 time series under the alternative hypothesis. Compute for each
simulation run j = 1,...,N the mid p-value $p_j^{H_A}$. Calculate

$$\hat F^{H_A}(x_i) = \frac{1}{N} \sum_{j=1}^{N} I(p_j^{H_A} < x_i).$$

3. Plot $\hat F^{H_0}(x_i)$ against $\hat F^{H_A}(x_i)$, for $x_i$, i = 1,...,m.
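
The three steps can be sketched as follows, with a handful of made-up p-values standing in for the N = 1000 simulated ones. The strict inequality follows the formulas as printed. The function names are hypothetical.

```python
def edf(p_values, x):
    """Share of p-values strictly below x."""
    return sum(1 for p in p_values if p < x) / len(p_values)


def size_power_curve(p_null, p_alt, grid):
    """Pairs (EDF under H0, EDF under HA) evaluated on a grid in [0, 1]."""
    return [(edf(p_null, x), edf(p_alt, x)) for x in grid]


# Hypothetical mid p-values from four runs under each hypothesis.
p_null = [0.02, 0.30, 0.55, 0.80]
p_alt = [0.001, 0.004, 0.03, 0.40]
grid = [0.01, 0.05, 0.50]
curve = size_power_curve(p_null, p_alt, grid)
```

Evaluating `edf(p_null, alpha)` at a significance level alpha gives the simulated size of equation (7). A curve far above the 45° line means that, at a given actual size, the test rejects much more often under the alternative.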

Results. Size-power curves of the Granger Lasso test and standard Wald test are shown
in Figure 1 (first three designs). The larger the difference between the size-power curve and
the 45° line, the more power the test has. For k = 25 (i.e. left panel) both curves are rapidly
increasing and very similar. When the number of time series increases (i.e. middle and right
panel), the size-power curve of the Granger Lasso test is hardly affected, and achieves a
much larger power than the standard Wald test.
4.2 Forecast exercise

For forecasting the time series $y_t$, we use a two-step procedure. First, we select predictor
time series. Second, we estimate the model with only the selected predictor time series. We
consider four selection and four estimation techniques, yielding 16 selection-estimation
combinations. We investigate the performance of each combination in forecasting the response.

As selection techniques we consider: (1) use all time series, (2) use the standard Wald test
to discard blocks of time series that are not Granger Causing the response, (3) use Granger
Lasso Selection (cf. Section 3.2) to discard blocks of time series that are not Granger
Causing the response, (4) use the Granger Lasso test to discard blocks of time series that
are not Granger Causing the response. Selection technique (4) is our proposed selection
technique. The tests are carried out at a 1% significance level.

After selecting the predictor time series, we forecast the response using either (1) Maximum
Likelihood, (2) the Adaptive Lasso estimator, (3) Bayesian shrinkage with the Minnesota
prior (Litterman, 1986), (4) the Factor Model of Stock and Watson (2002). These
are all leading methods for macro-economic forecasting (Inoue and Kilian, 2008). Methods
(2) and (3) perform shrinkage. Where the Adaptive Lasso sets some of the estimated
coefficients exactly to zero, the Bayesian estimator only shrinks the estimated coefficients
towards zero. Factor Models reduce the dimension of the predictor time series by extracting a
small number of common factors using principal component analysis.^1

To evaluate forecast accuracy, we conduct a rolling window forecast exercise. We use a
window of size $S = \lfloor 0.90 \cdot T \rfloor$. At each point t = S,...,T - 1, the models are
re-estimated and one-step-ahead forecasts are calculated. We evaluate the forecast accuracy of
each selection-estimation technique combination by calculating the Mean Absolute Forecast
Error^2

$$MAFE = \frac{1}{T - S} \sum_{t=S}^{T-1} |\hat y_{t+1} - y_{t+1}|, \tag{8}$$

where $\hat y_{t+1}$ is the predicted response for time t + 1. The MAFE is computed for each
simulated time series, and their average over N = 100 simulation runs is reported in Table 5.

^1 The number of factors r is determined by calculating the maximum eigenvalue ratio criterion
$r_j = \lambda_j / \lambda_{j+1}$ for j = 1,...,k - 1 from the eigenvalues
$\lambda_1, \dots, \lambda_k$ and selecting $r = \operatorname*{argmax}_j r_j$.

^2 Similar conclusions can be drawn by looking at the Mean Squared Forecast Error.

Table 5: Average MAFE for the four selection techniques (rows) and four estimation
techniques (columns).

Simulation design | Selection technique | ML | Adaptive Lasso | Bayesian | Factor Model
T = 100, k = 25 | All | 0.093 | 0.089 | 0.116 | 0.129
T = 100, k = 25 | Wald test | 0.082 | 0.082 | 0.121 | 0.086
T = 100, k = 25 | Granger Lasso Selection | 0.089 | 0.085 | 0.118 | 0.121
T = 100, k = 25 | Granger Lasso test | 0.082 | 0.082 | 0.120 | 0.086
T = 100, k = 50 | All | 0.126 | 0.092 | 0.122 | 0.138
T = 100, k = 50 | Wald test | 0.087 | 0.084 | 0.124 | 0.089
T = 100, k = 50 | Granger Lasso Selection | 0.119 | 0.092 | 0.122 | 0.137
T = 100, k = 50 | Granger Lasso test | 0.084 | 0.083 | 0.124 | 0.086
T = 100, k = 75 | All | 0.208 | 0.089 | 0.123 | 0.141
T = 100, k = 75 | Wald test | 0.117 | 0.088 | 0.121 | 0.107
T = 100, k = 75 | Granger Lasso Selection | 0.170 | 0.091 | 0.123 | 0.140
T = 100, k = 75 | Granger Lasso test | 0.083 | 0.080 | 0.119 | 0.085
T = 40, k = 150 | All | NA | 0.189 | 0.315 | 0.322
T = 40, k = 150 | Granger Lasso Selection | NA | 0.181 | 0.305 | 0.300
T = 40, k = 150 | Granger Lasso test | NA | 0.165 | 0.379 | 0.199
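
The rolling window evaluation can be sketched as below. The forecaster is passed in as a function of the estimation window, and the two toy forecasters are placeholders rather than any of the four estimation techniques. The function name `rolling_mafe` is hypothetical.

```python
import math


def rolling_mafe(y, forecaster, share=0.90):
    """Mean Absolute Forecast Error of one-step-ahead forecasts, each based on
    the S = floor(share * T) observations preceding the forecast target."""
    T = len(y)
    S = math.floor(share * T)
    errors = []
    for t in range(S, T):          # 0-based index of the forecast target
        window = y[t - S:t]        # the S most recent observations
        errors.append(abs(forecaster(window) - y[t]))
    return sum(errors) / (T - S)


# Toy series 0, 1, ..., 19: T = 20 gives S = 18 and two forecasts.
y = [float(i) for i in range(20)]
mafe_naive = rolling_mafe(y, lambda w: w[-1])              # repeat last value
mafe_drift = rolling_mafe(y, lambda w: 2 * w[-1] - w[-2])  # extrapolate trend
```

On this linear toy series the naive forecaster is always off by one and the trend forecaster is exact. In the simulation study, the forecaster would re-run the selection and estimation steps on each window.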

Results. Table 5 shows that selecting predictor time series is better than taking all
series, for all estimation techniques (except the Bayesian shrinkage estimator). Among the
selection techniques, improvements are larger with our Granger Lasso test compared to the
Granger Lasso Selection approach. Granger Lasso Selection discards fewer blocks of time series
compared to the Granger Lasso test, yielding less parsimonious models and reduced forecast
performance. When the number of time series increases relative to the length of the time
series, the Granger Lasso test also performs substantially better than the standard Wald
test. Paired t-tests confirm that (in the majority of cases) the improvements of the Granger
Lasso test compared to the other selection techniques are significant. More precisely, the
good performance of the Granger Lasso test is most pronounced in the high-dimensional
designs: it performs significantly best, among the four selection techniques, in 8 out of 12
cases (design T = 100, k = 50), 12 out of 12 cases (design T = 100, k = 75), and 6 out of 9
cases (design T = 40, k = 150).

For all simulation designs, the best forecast always involves the Granger Lasso test.
Among the estimation techniques, the Adaptive Lasso performs best. After the first selection
of predictive blocks of time series, the Adaptive Lasso can further reduce the number of
predictor time series in the second step. This is most suited for settings with a small number
of relevant predictor time series and a large number of irrelevant, noise predictor time
series. Similar conclusions are obtained by Bühlmann and Hothorn (2010) who discuss a "Twin
Boosting" procedure for improved feature selection and prediction.


5 The role of business and bank sentiment for macro-economic forecasting

We identify the most predictive industry segments for future macro-economic developments
using the Granger Lasso test from Section 3.3.

5.1 Model

We estimate 8 ARX models, one for each macro-economic indicator to predict. The time
series $y_t$ entering model (1) is one of the 8 macro-economic indicators of Table 2, taken in
first differences. The vector $x_t$ contains the k = 150 business and bank sentiment indicators
in first differences at time t. We use differences to ensure stationarity of the time series.^3
We estimate each ARX model using the Penalized Maximum Likelihood estimator from Section
3.1. Then, we perform Granger Causality tests, one for each of the 20 blocks of sentiment
indicators (cf. Section 2). As such, we test if the opinion of a particular industry segment,
as measured through the Business Survey, has incremental predictive power for the German
macro-economic indicators. We repeat this exercise for each industry segment using the
Bank Survey.

^3 Following standard practice, we first test for stationarity. A stationarity test of all
individual time series using the Augmented Dickey-Fuller test indicates that most time series
in levels are integrated of order 1.


5.2 Identifying the most predictive industries 

For each industry, Table 6 reports the p-value of the test that the opinion of that particular
industry does not Granger Cause a particular macro-economic indicator. Significant results
at the 1% level are in bold. We discuss the results by building on the sectoral classification
framework which distinguishes the primary, secondary, tertiary and quaternary sector.

Business Survey. The primary sector, unlike the other sectors, has almost no incremental
predictive power. The primary sector's contribution to Germany's GDP is also the
smallest. The secondary sector has the most incremental predictive power for the
macro-economic indicators to which it contributes most (IP-A1, IP-A2, IP-M and IP-E).
Firms active in the tertiary and especially the quaternary sector have incremental predictive
power for several macro-economic indicators. The quaternary sector consists of the
knowledge-based part of the economy, and accounts for roughly 65% of Germany's GDP. Firms
active in these sectors are at the heart of the whole economy.

Bank Survey. The Bank Survey contains less incremental predictive power than the
Business Survey. The predictive power of bank sentiment for predicting future macro-economic
developments is limited. This is in line with Dell'Ariccia et al. (2008) who find
that the real effects of a banking crisis are limited in developed countries, in countries that
have more access to foreign financing, and countries where banking crises are less severe,
which all apply to Germany.
Table 6: P-values of the Granger Causality test with null hypothesis that the opinion of a
particular industry segment (rows) does not Granger Cause a particular macro-economic indicator
(columns). Significant results at the 1% level are in bold.

Survey | Industry segment | Sector | IP-A1 | IP-A2 | IP-M | IP-E | IP-CaGo | IP-CoGo | RT | WS
Business Survey | Agriculture, mining & other industry | Primary | 0.03 | 0.04 | 0.03 | 0.99 | 0.01 | 0.01 | 0.01 | 0.84
Business Survey | Manufacturing | Secondary | 0.01 | 0.07 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.37
Business Survey | Construction | Secondary | 0.01 | 0.00 | 0.01 | 0.04 | 0.00 | 0.70 | 0.00 | 0.50
Business Survey | Wholesale, retail trade, transportation, food & service | Tertiary | 0.02 | 0.00 | 0.04 | 0.01 | 0.02 | 0.923 | 0.27 | 0.06
Business Survey | Information & communication | Quaternary | 0.92 | 0.02 | 0.90 | 0.00 | 0.02 | 0.50 | 0.04 | 0.04
Business Survey | Finance | Quaternary | 0.56 | 0.03 | 0.13 | 0.00 | 0.06 | 0.04 | 0.13 | 0.39
Business Survey | Real estate | Quaternary | 0.96 | 0.84 | 0.26 | 0.01 | 1.00 | 0.00 | 0.00 | 0.60
Business Survey | Administration & support | Quaternary | 0.01 | 0.03 | 0.01 | 0.00 | 0.00 | 0.01 | 0.21 | 0.00
Business Survey | Public services | Quaternary | 0.00 | 0.02 | 0.23 | 0.04 | 0.00 | 0.02 | 0.86 | 0.04
Business Survey | Other services | Quaternary | 0.05 | 0.00 | 0.01 | 0.00 | 0.00 | 0.07 | 0.66 | 0.12
Bank Survey | Agriculture, mining & other industry | Primary | 1.00 | 1.00 | 1.00 | 0.59 | 1.00 | 0.92 | 0.86 | 0.90
Bank Survey | Manufacturing | Secondary | 0.05 | 0.20 | 0.06 | 1.00 | 0.99 | 0.14 | 0.85 | 0.39
Bank Survey | Construction | Secondary | 0.82 | 0.82 | 0.92 | 0.01 | 1.00 | 0.70 | 0.84 | 0.03
Bank Survey | Wholesale, retail trade, transportation, food & service | Tertiary | 1.00 | 0.76 | 0.98 | 1.00 | 0.00 | 0.04 | 0.53 | 0.23
Bank Survey | Information & communication | Quaternary | 0.72 | 0.02 | 0.09 | 1.00 | 0.04 | 0.53 | 0.05 | 0.79
Bank Survey | Finance | Quaternary | 0.98 | 1.00 | 1.00 | 0.01 | 1.00 | 0.40 | 0.09 | 0.08
Bank Survey | Real estate | Quaternary | 0.76 | 0.90 | 0.60 | 1.00 | 1.00 | 0.73 | 0.80 | 0.62
Bank Survey | Administration & support | Quaternary | 0.01 | 0.29 | 0.00 | 1.00 | 0.80 | 0.78 | 0.68 | 0.00
Bank Survey | Public services | Quaternary | 0.03 | 0.07 | 0.01 | 0.03 | 0.03 | 0.03 | 0.03 | 0.05
Bank Survey | Other services | Quaternary | 0.46 | 0.77 | 0.82 | 0.47 | 0.69 | 0.05 | 0.16 | 0.98


5.3 Robustness checks 

Our main research question is whether the sentiment of different industry segments has
predictive power for macro-economic indicators. Our methodology is also applicable to other
ways of segmenting firms, such as by the region in which they are located or by company
size. For our data, there are 10 regions and three company sizes. We re-estimate the 8 ARX
models and perform the Granger Causality tests for the 20 regional blocks (i.e. 10 blocks for
the Business Survey, 10 blocks for the Bank Survey). Likewise, we re-estimate the 8 ARX
models and perform the Granger Causality tests for the 6 company size blocks (i.e. 3 blocks
for the Business Survey, 3 blocks for the Bank Survey).

Similar to the industry results discussed in Section 5, we find that business sentiment
has more incremental predictive power than bank sentiment. Furthermore,
Germany’s largest geo-economical regions, the Ruhr area and the Southern states, have the most
incremental predictive power for the macro-economic indicators to which their day-to-day
business contributes most, i.e. IP-A1, IP-A2, IP-M, IP-E and IP-CaGo, IP-CoGo, respectively.
Finally, small- and medium-sized companies have more incremental predictive power
than large companies. Germany is dominated by small- to medium-sized companies that
are global market leaders in their segments, and, hence, those might be best at evaluating
Germany’s economy. Detailed results are available from the authors upon request.


6 Forecasting German macro-economic developments 


We perform a rolling-window forecast exercise using a window of length S = 30. For each
time window, we estimate the 8 ARX models. We use the same selection and estimation
techniques as in Section 4, except for the standard Wald test and the ML estimator, which
are not available since the number of time series exceeds the time series length. Next,
one-step-ahead forecasts are computed for t = S + 1, ..., T. We report the Mean Absolute
Forecast Error (MAFE), defined earlier in the paper, for each macro-economic indicator and
each selection-estimation technique combination in Table 7.
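
The mechanics of such a rolling-window exercise can be sketched as follows. This is only an illustration under simplifying assumptions: a univariate AR(1) model fitted by ordinary least squares stands in for the ARX models and the selection-estimation techniques of the paper, the MAFE is taken to be the plain average of the absolute one-step-ahead errors, and the function names `fit_ar1` and `rolling_mafe` are hypothetical.

```python
from statistics import fmean


def fit_ar1(y):
    """OLS fit of y[t] = c + phi * y[t-1] on one window; returns (c, phi).

    Assumption: AR(1) is a stand-in for the paper's ARX models.
    """
    x, z = y[:-1], y[1:]
    mx, mz = fmean(x), fmean(z)
    sxx = sum((a - mx) ** 2 for a in x)
    # A constant window carries no slope information: fall back to the mean.
    phi = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sxx if sxx else 0.0
    return mz - phi * mx, phi


def rolling_mafe(y, window):
    """Average absolute one-step-ahead forecast error over rolling windows.

    For each t = window + 1, ..., T (1-based), the model is re-estimated on
    the `window` most recent observations and used to forecast y at time t.
    """
    errors = []
    for end in range(window, len(y)):
        c, phi = fit_ar1(y[end - window:end])
        forecast = c + phi * y[end - 1]
        errors.append(abs(y[end] - forecast))
    return fmean(errors)


# A linear trend is forecast exactly by an AR(1) model with c = 1, phi = 1.
trend = [float(t) for t in range(40)]
print(rolling_mafe(trend, window=30))
```

With a window of 30 and 40 observations, the loop produces 10 forecasts, one for each time point after the first window.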

Among the selection techniques, the proposed Granger Lasso test performs best. It
attains the lowest value of the MAFE in 20 out of 24 cases (83% of the cases). The MAFEs
obtained when all industries are used and when Granger Lasso Selection is used are close to
each other, because the latter (overall) does not discard any of the industry blocks. In
contrast, a much more parsimonious model is obtained using the Granger Lasso test. These
parsimonious models lead to an improved forecast accuracy in the majority of cases.
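
The count of cases can be checked directly from the values reported in Table 7. The short script below is a bookkeeping sketch, not part of the original analysis: the dictionary layout and the function name `count_test_wins` are illustrative choices, and a "win" is taken to mean that the Granger Lasso test has a strictly lower MAFE than both other selection techniques.

```python
# MAFE values copied from Table 7. For each indicator, the three tuples are the
# rows "All", "Granger Lasso Selection" and "Granger Lasso test"; within a tuple
# the order is (Adaptive Lasso, Bayesian, Factor Model).
ESTIMATORS = ("Adaptive Lasso", "Bayesian", "Factor Model")
TABLE7 = {
    "IP-A1":   ((1.460, 0.921, 1.275), (1.460, 0.921, 1.275), (1.138, 0.962, 0.937)),
    "IP-A2":   ((1.462, 0.817, 1.207), (1.462, 0.817, 1.207), (0.567, 0.640, 1.006)),
    "IP-M":    ((1.720, 1.117, 1.641), (1.720, 1.117, 1.641), (1.688, 1.090, 1.342)),
    "IP-E":    ((2.237, 1.171, 2.105), (2.237, 1.171, 2.105), (1.249, 0.959, 1.601)),
    "IP-CaGo": ((2.734, 1.892, 3.147), (2.734, 1.892, 3.147), (3.707, 1.834, 2.926)),
    "IP-CoGo": ((1.142, 0.609, 0.918), (1.142, 0.609, 0.918), (0.777, 0.617, 0.915)),
    "RT":      ((2.025, 1.109, 1.723), (2.025, 1.109, 1.723), (1.140, 1.035, 1.510)),
    "WS":      ((1.524, 0.530, 0.800), (1.524, 0.530, 0.800), (0.566, 0.685, 0.677)),
}


def count_test_wins(table):
    """Count, per estimator, the indicators where the test row is strictly lowest."""
    wins = dict.fromkeys(ESTIMATORS, 0)
    for all_rows, selection_rows, test_rows in table.values():
        for j, name in enumerate(ESTIMATORS):
            if test_rows[j] < min(all_rows[j], selection_rows[j]):
                wins[name] += 1
    return wins


wins = count_test_wins(TABLE7)
total = sum(wins.values())
print(wins, total)
```

Run on the Table 7 values, this gives 7 wins for the Adaptive Lasso, 5 for the Bayesian estimator and 8 for the Factor Model, i.e. 20 of the 24 cases.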

For the Adaptive Lasso, the Granger Lasso test leads to the lowest MAFE for 7 out of 8
macro-economic indicators. The MAFEs with the Granger Lasso test are, on average, 40%
lower compared to the other selection techniques. After the first selection step, where either
an entire block of business or bank sentiment indicators is selected or not, the Adaptive
Lasso allows some of the time series belonging to one of the selected blocks to be discarded
in a second stage. Further reducing the number of relevant predictor time series within
the selected blocks improves forecast accuracy.

Table 7: Mean Absolute Forecast Error for the three selection techniques (rows), the three
estimation techniques (columns), and the 8 macro-economic indicators (blocks).

Response   Selection technique       Adaptive Lasso  Bayesian  Factor Model

IP-A1      All                       1.460           0.921     1.275
           Granger Lasso Selection   1.460           0.921     1.275
           Granger Lasso test        1.138           0.962     0.937

IP-A2      All                       1.462           0.817     1.207
           Granger Lasso Selection   1.462           0.817     1.207
           Granger Lasso test        0.567           0.640     1.006

IP-M       All                       1.720           1.117     1.641
           Granger Lasso Selection   1.720           1.117     1.641
           Granger Lasso test        1.688           1.090     1.342

IP-E       All                       2.237           1.171     2.105
           Granger Lasso Selection   2.237           1.171     2.105
           Granger Lasso test        1.249           0.959     1.601

IP-CaGo    All                       2.734           1.892     3.147
           Granger Lasso Selection   2.734           1.892     3.147
           Granger Lasso test        3.707           1.834     2.926

IP-CoGo    All                       1.142           0.609     0.918
           Granger Lasso Selection   1.142           0.609     0.918
           Granger Lasso test        0.777           0.617     0.915

RT         All                       2.025           1.109     1.723
           Granger Lasso Selection   2.025           1.109     1.723
           Granger Lasso test        1.140           1.035     1.510

WS         All                       1.524           0.530     0.800
           Granger Lasso Selection   1.524           0.530     0.800
           Granger Lasso test        0.566           0.685     0.677

In line with the results of our simulation study, pre-selecting based on the Granger Lasso 
test is less favorable for the Bayesian shrinkage estimator compared to the other estimation 
techniques. Nevertheless, the Granger Lasso test in combination with the Bayesian shrinkage 
estimator still leads to the lowest MAFE for 5 out of 8 macro-economic indicators, with an 
average reduction in MAFE of 10%. 

For the Factor Model, the Granger Lasso test consistently leads to the lowest MAFE.
The MAFEs with the Granger Lasso test are, on average, 20% lower compared to the other
selection techniques. Discarding the least predictive industry blocks in this high-dimensional
data set and estimating the factors based on the most predictive industry blocks thus leads to
important gains in forecast accuracy. This result is in line with Bai and Ng (2008), who find
important gains in forecast accuracy from diffusion index models by not using all predictors
but by using fewer, informative predictors.


Robustness checks. We investigate the robustness of the results to the choice of segmentation
criterion. We repeat the same forecast exercise using the region segments and
company size segments instead of the industry segments (cf. Section 5.3). The conclusions
obtained with either the industry, region or company size segments are very similar. For
the regional segments, the Granger Lasso test is the best performing selection technique and
attains the lowest value of the MAFE in 71% of the cases (17 out of 24). The same holds for the
company size segments, where the Granger Lasso test leads to the lowest MAFE in 71%
of the cases (17 out of 24). Detailed results are available from the authors upon request.


7 Discussion 

This paper presents a high-dimensional Granger Causality test. It detects the most predictive
industry segments for future macro-economic developments. For this purpose, we
use both business and bank sentiment surveys answered by firms across Germany. Not
all industry-specific sentiment indicators are equally predictive for all macro-economic
indicators. Industries contain most predictive power for the macro-economic indicators most
closely tied to their day-to-day business activities.

Our forecast exercise shows that important gains in forecast accuracy can be obtained
by not using all industry segments, but by first selecting the most predictive ones using the
Granger Lasso test. This selection of the most pertinent industry segments provides important
information for institutes conducting these sentiment surveys. For instance, instead of
equally spreading respondents among all segments, the number of respondents in predictive
segments could be increased, whereas the number of respondents in non-predictive segments
could be decreased. Alternatively, non-predictive segments could even be completely
discarded, which provides an opportunity to obtain cost savings.

The identification of pertinent respondents also applies to consumer sentiment surveys.
In the large literature on consumer sentiment, this topic has received little attention. We
perform an exercise similar to the one described in this paper using a consumer sentiment survey data
set from the National Bank of Belgium. Sentiment indicators are available for different classes
of consumers’ net disposable income, profession, employment status, education, age and
gender. We study their predictive power for several retail trade indicators. The profession,
education, and age sentiment indicators contain most predictive power. Again, important
gains in forecast accuracy can be obtained by first selecting the most predictive sentiment
indicators (for a specific target variable of interest) instead of using all indicators.

In our sentiment application, the Business Survey contains more predictive power than
the Bank Survey. Future research could further deepen our understanding of the usefulness
of bank sentiment. It would be interesting to investigate if this sentiment differs between,
for instance, countries that are more or less severely hit by banking crises, and developed
or developing countries. The study of sentiment with respect to the banking sector opens a
rich area of new research on sentiment surveys.